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Abstract 

For decades, the Vlasov-Darwin model has been recognized to be attractive for particle-in-cell 
(PIC) kinetic plasma simulations in non-radiative electromagnetic regimes, to avoid radia¬ 
tive noise issues and gain computational efficiency. However, the Darwin model results in an 
elliptic set of held equations that renders conventional explicit time integration uncondition¬ 
ally unstable. Here, we explore a fully implicit PIC algorithm for the Vlasov-Darwin model 
in multiple dimensions, which overcomes many difficulties of traditional semi-implicit Darwin 
PIC algorithms. The finite-difference scheme for Darwin held equations and particle equa¬ 
tions of motion is space-time-centered, employing particle sub-cycling and orbit-averaging. 
The algorithm conserves total energy, local charge, canonical-momentum in the ignorable 
direction, and preserves the Coulomb gauge exactly. An asymptotically well-posed fluid pre¬ 
conditioner allows efficient use of large time steps and cell sizes, which are determined by 
accuracy considerations, not stability, and can be orders of magnitude larger than required 
in a standard explicit electromagnetic PIC simulation. We demonstrate the accuracy and 
efficiency properties of the algorithm with various numerical experiments in 2D-3V. 


1. Introduction 

The electromagnetic (EM) Particle-in-ccll (PIC) method solves Vlasov-Maxwell’s equa¬ 
tions for kinetic plasma simulations Q0. In the standard approach, Maxwell’s equations 
are solved on a grid, and the Vlasov equation is solved by the method of characteristics using 
a large number of particles, from which the evolution of the probability distribution function 
(PDF) is obtained. The held-PDF description is tightly coupled. Maxwell’s equations (or 
a subset thereof) is driven by moments of the PDF such as charge density and/or current 
density. The PDF, on the other hand, follows a hyperbolic equation in phase space, whose 
characteristics are determined by the fields self-consistently. 

Here, we are interested in the Vlasov-Darwin approximation to the Vlasov-Maxwell set 
of equations, useful for low-frequency plasma applications in so-called non-radiative regimes. 
The Darwin model of electrodynamics, which is 0(y/c ) 2 approximation of the Maxwell’s 
equations Q, eliminates the light-wave propagation in the plasma. In doing so, the Dar¬ 
win model avoids unwanted electromagnetic wave excitation and related instabilities 0-0|. 
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However, the Vlasov-Darwin equations turn out to be more difficult to solve in practice than 
Vlasov-Maxwell, despite the fact that Darwin equations are seemingly simpler (without the 
2nd-order time derivative of vector potential, or the transverse displacement current). Math¬ 
ematically, this simplification fundamentally changes the character of the field equations from 
hyperbolic to elliptic. As a consequence, explicit time integration schemes (commonly used 
for Vlasov-Maxwell PIC algorithms) are unconditionally unstable Q. To overcome the diffi¬ 
culty, semi-implicit moment methods have been proposed |7;j (and have become the standard, 
see Refs. j8 17| and references therein) to time-advance the Vlasov-Darwin PIC system. 
However, the semi-implicit moment method is notably more complicated and difficult to 
solve than explicit schemes for the Vlasov-Maxwell equations, especially with non-periodic 
boundary conditions jlo|, 12, [0| 19|. 

Fully nonlinearly implicit PIC algorithms j(3, 2(3- 221 take advantage of modern iterative 
solvers, e.g. Jacobian-free Newton-Krylov (JFNK) methods |23| to converge the field-particle 
system nonlinearly. A tight nonlinear tolerance is enforced for convergence between particles, 
moments, and self-consistent fields at each timestep. Moment equations can be effectively 
used in the preconditioner stage of JFNK ( 22 I. 241 ]. to accelerate the convergence of the 


iterative kinetic solver. Discrete conservation theorems for total energy, local charge, and 
particle canonical momentum can be derived for those algorithms, which are attractive for 
long-time simulations. Particle orbit integration is subcycled [20, 22], resulting in much 
improved orbit accuracy, and allowing an efficient implementation on modern computer 


architectures [25j. As a consequence, these algorithms have shown the ability to overcome 




many difficulties of traditional semi-implicit PIC algorithms (e.g^, implicit-moment |26M3l|. 
direct-implicit methods (jjl-fjfil]. and Darwin implementations [3, [&. ITol . TH ) for accurate 
long-term kinetic simulations. 

The main objective of this study is to generalize the 1D-3V study in Ref. (22| to deliver an 
implicit, conservative Darwin-PIC algorithm in multiple dimensions. The algorithm employs 
the potential (0-A) formulation of the Darwin equations. Both held and particle equations 
are discretized using a space-time-centered finite difference scheme. As in ID |22|, the 
fully implicit character of the implementation is key to realize the desired conservation 
and performance properties. Particles are substepped for orbit integration accuracy, as 
particle time scales may be much faster than held time scales. Synchronization between 
helds and particles is accomplished by orbit-averaging j2cj|. We prove conservation theorems 
for global energy, local charge, particle canonical momentum, and the preservation of the 
Coulomb gauge (V • A = 0) on a uniform grid in a periodic plasma system. A moment-based 
preconditioner is formulated by taking the zeroth and hrst moment of the Vlasov equation. 
Fluid and ambipolar asymptotic regimes are handled effectively by the fluid preconditioner. 
The performance of the preconditioner is, however, limited by electron Bernstein modes, 
which are not captured by the simplified moment system employed. As a result, the kinetic 
solver is robust against variations in domain sizes as well as mass ratios, provided that 
electron Bernstein mode timescales are respected. In practice, for a given magnetization, 
this sets a lower limit for the electron-ion mass ratio. 

The rest of the paper is organized as follows. Section [2] introduces our formulation for 
the general Vlasov-Darwin model and its favorable properties. The model is specihed in 2D- 
3V and discretized with an implicit central-difference scheme in Sec. El where we propose a 
new automatic charge-conserving particle-moving scheme for multiple dimensions, and prove 
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theorems for the exact conservation of global energy and particle canonical momenta in a 
discrete setting, as well as the preservation of the Coulomb gauge. Section []] provides a 
detailed description of moment-based preconditioning for the JFNK kinetic solver. Numer¬ 
ical examples demonstrating the accuracy, performance, and conservation properties of the 
algorithm are presented in Sec. [5] Finally, we conclude in Sec. |6] 


2. Electromagnetic Vlasov-Darwin model 

The Vlasov-Darwin equations for a collisionless electromagnetic plasma can be written 


as 


S 0 0[39 


dtfa + V • V/q, + — (E + V X B) • V v fa = o, 
m a 

—V 2 A + j — eo<9tV</> = 0, 
ho 

e O V 2 0 + p = 0, 


( 1 ) 

( 2 ) 

(3) 


where f a ( r,v) is the particle distribution function of species a in phase space, q a and m a 
are the species charge and mass respectively, eo and /i 0 are the vacuum permittivity and 
permeability respectively, (j) and A are the t scalar and vector potentials, respectively. The 
electric and magnetic fields are defined uniquely from </>, A as: 


E = -V</> - d t A ; B = V x A. 


(4) 


The set of Darwin equations is driven by the plasma current density 



(5) 


Unlike Maxwell’s equations, the Darwin model does not feature Gauge invariance, and only 
the Coulomb gauge 

V • A = 0 (6) 


is physically acceptable (to enforce charge conservation). 

Note that the Vlasov-Darwin model (Eqs. [Ul6]) is overdetermined, as it has more equations 
than unknowns. It features two involutions ( 40 I . 4ll |: Poisson’s equation and the solenoidal 
constraint of the vector potential. (An involution is a constraint satisfied by the solution of 
the system at all times, if satisfied initially.) These two involutions are not redundant, and 
must be enforced numerically B, P- 359] to prevent spurious modes from being excited [0. 

To solve the above Vlasov-Darwin equations, we begin by realizing that the two involu¬ 
tions do not need to be enforced explicitly. In stead of Poisson’s equation, we consider the 
equation: 

e o <9 t V 2 0-V- j = 0, (7) 


found by taking the divergence of Eq. [2] and using Eq. [6] The two involutions are then 
implied by Eqs. |TJ [2] and [7] when the local charge conservation equation, 


dtp + V • j = 0, 


( 8 ) 
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(which is implied independently by the Vlasov equation, Eq. [T]) is satisfied. In particular, 
Poisson’s equation is implied by Eq. [3] and Eq. EJ The solenoidal constraint is implied as 
well, as can be seen by taking the divergence of Eq. [2] and using Eq. [31 to find: 


V • V 2 A = 0, 


from which, with appropriate conditions, Eq. [6] follows j42|. The derivation requires that 
V ■ (V 2 A) = V 2 (V • A) = 0, and the boundary conditions be consistent with V • A = 0 at 
the boundary [43| (i.e., they must enforce continuity of the normal component of the vector 
potential at the boundary). 

Equations [U El SI El and [7] constitute the minimal Vlasov-Darwin equation set of choice 
in this study. We emphasize that the main advantage of this set is that the two involutions 
(Poisson’s equation and the solenoidal constraint of A) are built-in, and thus do not need 
to be enforced or solved explicitly when local charge is strictly conserved. This property, 
when implemented discretely, will be most advantageous for both accuracy (it avoids spu¬ 
rious modes) and efficiency (it avoids the extra divergence-cleaning step via conventional 
projection methods H or hyperbolic cleaning |45|). Most importantly, this formulation 
avoids explicitly enforcing Eq. El which has been a critical implementation roadblock for 
the Darwin approximation in multiple dimensions in previous studies [7s, 10|. Carrying the 
involution enforcement to the discrete will require a very careful discrete treatment, however, 
and in particular one that strictly conserves local charge. 


3. Multi-dimensional, implicit, particle-based discretization of the Vlasov-Darwin 
model 

We employ a centered finite difference method to discretize the 2D-3V Vlasov-Darwin 
equations in Cartesian geometry on a uniform Yee grid (see Fig. [[]), which has N x and N y 
cells in the x and y directions, respectively. The field equations (Eq. [21 ED are written by 
replacing the derivatives with central difference schemes at the n + 1 /2 time level as 

! / Ww \ " +V ’ (&] .+V«\” +Va / «]<+■« \ ” +Va 

(^x "b ^y) I \Ay\i,j+ 1 / 2 I “1“ I \h /]*J+V 2 I — e 0 $t j 3y[ ( f ) \i,j+ i /2 I — 0(9) 

^ ' mk, J \\j z h J V o j 

e o + Sy)[(f)]ij /2 — (8 x [j x \ij + 5y[j y ]ij) n+/2 = (1,0) 

where the subscripts i and j denote cell index in the x and y directions, respectively, and 1 < 
i < N x , 1 < j < N y . The orbit-averaged current density j is found from particles as described 
below. We define Q n+1//2 = ( Q n+1 + Q n )/ 2, where Q is one of the unknown quantities. The 
finite-difference operation in time is defined at n + 1 /2 as St[Q] = ( Q n+1 — Q n )/At. The first- 
order derivative in space is defined at the center of the two adjacent values, e.g., 8 x [(j)\i + i/ 2 j = 
(0j + ij— (pij)/Ax. The second-order derivative in space is defined at the center of two adjacent 
first-order derivatives, e.g., <5 2 [0]ij = (S x [(p\i + y 2 j — i^j)/Ax. Similar definitions are 

set for quantities at cell faces, e.g., S x {j x ] itj = (j i+ i/ 2J - - ji-i/ 2j )/Ax, and (5 2 + 5y)[A x \ i+ i/ 2tj = 
T A-xi— i/ 2 ,j)/Ax “ T (Aj.j_|_i/ 2) j_|- 1 2A 3 .j_|_y 2 j’ T A-xi+ 1/2 j— 1 ) /Ay , etc. 
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Figure 1: 2D Yee grid and associated quantities. An integer index i or j denotes a cell center, and a half 
index i + x j 2 or j + x /2 denotes a cell face. 


The current density j in Eqs [9] [TO] is gathered from particles by orbit-averaging the 
individual current contributions, 


j ra+l/2 


Nv -1 


At 


v E r I/s * 


p ’ 


p i/=0 


( 11 ) 


where the index v denotes a substep, with subtimestep A t”, and N u denotes the number of 
substeps. Particle substeps satisfy A= At. The individual current components 

are gathered on the mesh (see Fig. [1]) from particles at every particle substep according to: 


(i Y +1 / 2 


tip,y) 


17+1/2 

ij+l/2 


Up,z) 


77+1/2 


where we define: 


AxAy p ’ x 2 y 1 

Qp y+l/2 C !7+l/2 


_7++ 1 / 2 O' 

AxAy p ' y 2 


x n 


- *0Si(!£ +Vi - 


^7»+ 1 /2)) 
l/j + l/2), 


gp 17+V2 

AxAy P ’ 2 



x f )Sr Vl (+ - Vj) + AS 22 ). 


S 2 +1/3 (x p - Xi ) = [*S 2 (x p +1 - x,) + S 2 (x p - x,)] /2, 
Sr V ’(+ - %) = [S 2 « +1 - Vj) + S 2 (y; - y,)} /2. 


( 12 ) 

(13) 

(14) 

(15) 

(16) 


For the second-order splines, we use the average of S 2 between the positions at v and v + 1. 
The 2D shape function is the tensor-product of the corresponding ID shape functions. A 
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ID shape function is second-order if the grid quantity is located at a integer grid-point, and 
first-order if it is at a half grid-point. Specifically, for a particle in a cell i, 


Si(x p 

Xi— 1 / 2 ) 

= 1 - 

x p - 

L 

Si(x p 

■P‘i+ 1 / 2 ) 

Xp 

- Xi- 


Ax 


{X p ~ Xi) 

3 

( X n 

S 2 1 

4 ” 

1 J 

V 

Co 

bO 

'iT 

0 

-k( 

V 


Ax 


yp 


(17) 

(18) 

(19) 

( 20 ) 


Note that the shape functions used here are dimensionless quantities. The interpolation for 
j z features a small truncation error correction: 


A§22 


S 2 (x u p +1 - Xi) - S 2 {x v p - Xi) S 2 (y p +1 
2 


Vi) ~ s 2(Vp ~ Vi) 
2 


( 21 ) 


which is needed for both exact energy (Sec. 13.211 and canonical momentum (Sec. 13.31) 
conservation. 

The Vlasov equation is solved by particles following their characteristics, which are gov¬ 
erned by the equations of motion: 


<%x p = v p , 

a 

dtVp &p (Ep T Vp x Bp), 
ni p 

where the subscript p denotes a particle (or Lagrangian) quantity. As in Ref. 
equations are discretized using a Crank-Nicolson scheme at v + x /r. 


( 22 ) 

(23) 


221 . 146] . these 


r v +1 


p = v ^+V2 


Art; 


T u-\-\ 


P — a Z/ + 1 /2 


A r v 

p 


(24) 

(25) 


The sub-time step is determined here by a second-order local error estimator At = 0.1 min (ay 


-l 


22], where u t = — \d^.(f)\ is the harmonic frequency of a trapped particle in the potential, and 


cu c = —B is the gyrofrequency. Here v l ' +1 / 2 = (v 12 + v u+1 )/2. 


The exact form of the acceleration term ap +1 ^ 2 is determined by held interpolations to the 
particle position (scatter) Q. Here, the electric held is time-centered, and, to enforce exact 
energy conservation, the scatter is performed exactly in the same way as the corresponding 
current components (Eqs. fl2lfl4|) . namely: 



fnv+y* 

p,x 


TTV+y* 

P,V 


T?v+y2 

^p,z 


E - *i^)s w A'\yp - Vi), 


I2+V2 / 


h3 


E K7.lA s A'^ x v - ^.)s,« +1/2 - »+./*), 


h3 


E e 7T k +,/5 (% - ^)sr i/2 fe - Vi)+ as 22 


h3 


(26) 

(27) 

(28) 
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(29) 


The scatter prescription for E PtZ again features the a small correction A § 2 2 for exact canonical 
momentum conservation. The electric field components on the grid are found from a discrete 
form of Eq{2 

~^y[4 > ]i,j+ 1 /2 — fit [Ay\i j+1/2 

s „ ~ fit\A-z\i,j 

Using the following property of the B-splines: 

dS 2 _ _ dS 2 _ _Si(x i+ y 2 - Xp +1/2 ) - Si(xi_i 

dx , u+1/2 , dx , v+1/2. Ax 

one can avoid computing the spatial derivatives of the electrostatic potential, and write Eqs. 
E6H27] as: 


V+l/2\ 

-1/2 ~ Xp ) 


(30) 


p,®,i+V 2 ,j 


E 


-C +1/2 ^ +Va fc - Vi) §| 

A n +l _ An 
^xi-Y- 1 /2,j ^xi 


(Xp + 1 / 2 -X i+1/2 ) 

n 

A( 


•U+ 1 / 2 ) 


(31) 


J^v- f 1 / 2 

p,y,ij+ 1 / 2 


E 


hi 


in+t/2 c u+ 1 /2 ( \ dS 2 


{y P +1/2 ~ 


yj+y 2) 


y 4 n + 1 _ ^4”- 

- JM+1/2 At y -^ S?‘% r - Vi)S t (^ - x i+t/2 ) 


(32) 


The scatter of the magnetic field components to the particles is done as follows. We 
assume that the vector potential varies linearly in time within a macro-step, i.e., 


AU+ 1 _ au An+l _ An 

z,i,j z,i,j _ z,i,j z,i,j 


A T u 

p 


At 


— E ■ 


and 


A 


A u+1 A v 

1/+1/2 _ z,i,j ~ z,i,j 


z,t,3 2 

The magnetic field is found by taking the curl of vector potential at the particle position: 


B px \ 

( d y A z 

Bpy I = 

~d x A z 

B pz ) 

\ d x A y - d y A, 


(x p ,y p ,Zp) 


where 


A x 

Ay 

Ax 


E 


Ki+A S A X - Xi+y 2 )sy' ,2 (y - vt) \ 


- Xi)SAy -»+v.) 

\ KtTs u 2 + ' , \x-x i )Sy' , \y-y 1 ) ) 


(33) 
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It is seen that V • B = 0 is always satisfied. It is worthwhile to point out that the above 
discretization and field scattering choices are unconventional (as compared to those normally 
used for explicit schemes, e.g. bilinear or area weighting and leap-frog stepping), and are 
motivated by conservation considerations, which will be detailed in the following sections. 

The particle equations of motion (Eqs. [24j [25]) are updated using an implicit Boris 
integrator fzj [47} , which is Picard-iterated as follows: 

1. v = v 1 ' + a fc E(x£ +1 ), 

2 ,+y 2 = v + Q= fc [v x B(x^ +1 ) + a k (v • B(x^ +1 )) B(x^ +1 )] 

1 + (a k B(^ +1 )y 

3. A r v k = min(A t £, dl x /v v x +1/ \ dl y /v v y +1/2 ), 

4. + 

where v denotes a intermediate step particle velocity, \ u+1 i 2 denotes the average particle 
velocity, x£ +1 is particle position, the superscript k denotes the Picard iteration count, 
a k = Ar^q/2m, and dl x and dl y are the distances to the cell boundary the particle is heading 
to in the x and y directions, respectively. The particle substep is iterated as well. This is 
important to ensure that particles do not cross cells within a sub-step, for reasons that will 
be apparent in the next section. The orbit iteration typically converges in 3-5 iterations for 
an absolute tolerance of 10 -12 . After convergence, we find v u+1 = 2v v+1//2 — v". 

We discuss next our strategy to enforce exact conservation of local charge, total energy, 
and particle canonical momentum. 


(34) 

(35) 


3.1. Local charge conservation theorem 

An exact local charge conservation scheme for ID implicit PIC, based on particle sub¬ 
stepping, was originally proposed in Ref. ||20j, and later extended to 2D implicit PIC in 
[48| . However, the 2D extension relied on using nearest-grid-point (NGP) interpolations of 
current components along the cell face. Here, we propose a new 2D PIC approach which 
avoids the NGP interpolation, and involves tensor products of ID shape functions of at least 
first order. 

Exact charge conservation requires that the continuity equation, 

| + V.J=0. (36) 

be satisfied at every grid point. After temporal and spatial discretization in 2D according 
to Fig. [U we have 


(n \ n+1 (n\ n a ) n+ /2 - a r + /2 (i ) n+ /2 - (i ) n+,2 

\P)i,j ~ VxJi+i/zj i / 2 j \Jy)i,j+ 1/2 Uy)i,j- 1/2 _ q 


(37) 


At Ax Ay 

Using the definitions for the orbit averaged current in Eq. [lTJ and noting that we can write: 


(P)g 1 ~ (P)?j 

At 


Ny-l 




v =0 


(P) 


, i u+1 

Ph,3 


(P: 


, \y. 

pit,3 


At; 
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it is sufficient for local charge conservation to satisfy the following charge conservation state¬ 
ment per particle substep: 

(n U +1 (n l 1 ' (i Y +1 ! 2 — (j Y +1 / 2 (j U + 1 / 2 — (i Y +1 / 2 

\Pp)i,j — \Pp)i,j \Jp,xJi+i/2,j \Jp,x)i-i/2,j \Jp,y)i,j+ 1/2 \Jp,y)i,j- 1/2 


A T u 

p 


+ 


Ax 


+ 


Ay 


= 0. 


Using the current components in Eqs. 
particles at every particle substep as: 


and defining the charge density gather from 


Mj = 


Q P 


-S 2 (xp- Xi)S 2 (yp -yj), 


we hnd: 


Si(x v p +1 - Xi)S 2 {y p 


AxAy 

“ +1 - vi) - Si K - Xi)Si(yZ - Vi) 


(38) 


A T u 

P 


Si(xp +1/2 - x i+ y 2 ) - Si(xp 


+v. 


V+^/2 _ \ 

- Vi ) 


+V. 


V+i/2 

Ax 

Si(rf + ‘ /2 - y i+ y,) - - Bj _, /2 ) s „ +1/2 


V+l/2 * 


' yp Ay 2 

By Taylor expansion within a spatial cell |2oj|, we can write: 

dS 2 


X p Xj / 


0. 


(39) 


S 2 (xi - Xp ) - S 2 (xi - x 1 ^) = 


dx 


(3:p +1/,2 — Xi) 


K +1 - <)■ 


Together with the property of the B-spline (Eq. I30lh we hnd: 

Si{x i+ i /2 - Xp +1/2 ) - Si(xi- 1 / 2 - Xp +1/2 ) 


S 2 {xi - Xp ) - S 2 (xi - x^) = 


Ax 


K +1 - <)• 


Using similar expressions for the y direction, and noting that v : 

„,"+ 1 / 2 _ I.„v ■ 1 


"+ 1 / 2 — („v+l 


xp 


= \X„ 


Xp)/A t p and 


= (Vp ~ O/Ar/, we finally hnd: 


S 2 {x" +l - Xi)S 2 (y p 


v+\ 


Vo) ~ S 2 (x v p - Xi)S 2 (y u - yj) 


At v 

p 


S 2 (x u p +1 -Xj) - S 2 {a%-Xi) +1/2 


Ay 


S 2 (y P - yj) 


S 2 (Vp +1 - yj) - S 2 (y u p - Vj ) s „ +1/2 


A t; 


\x p — Xj) = 0. 


(40) 


The identity in Eq. [40] is valid only when the particle trajectory lies within a cell. The 
above derivation motivates a charge-conserving way of pushing particles, namely, no particle 
sub-step crosses a cell boundary |20j, thus motivating the Picard algorithm in the previous 
section. The interpolations proposed in Eqs. rHZlTHTl [38l for p v j/,%^ 2 and jp^ 2 are similar to 
those in Ref. j49j|, but with one-order higher interpolations. 

We see that some of the moment gathering rules have been dehned to take advantage of 
the B-spline identity (Eq. I30lh so that the continuity equation is automatically satished. We 
will see that other consistent interpolations are also required for exact energy and canonical 
momentum conservation. 
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3.2. Total energy conservation theorem 

In the proof that follows, we assume a periodic system. As in earlier studies, we begin 
by dotting the particle velocity equation (Eq. [25]) with the averaged velocity Vp +1,/2 , orbit¬ 
averaging all substeps, and summing over all particles, to find: 

= E i E 


r v+1 + v 

T) ' Y ' 


At 


At 

1 


m r . 


V v l '+1 

p p 


V 

--At 1 ' 

At? p 


Ea - t J2 q p^p- Ep Y +1/2Ar P 


E a„Ah (T7’ n ~^ 1 / 2 0 -n +V 2 _i_ rp n + 1 l 2 • n + 1 / 2 I Tp n + 1 / 2 ■ n + 1 / 2 \ 

X y y x,i+ 1 / 2 ,jHx,i+ l / 2 ,j y,i,j+ydy,i,j- 1- 1 / 2 z,i,j 3 z,i,j J > 


where K = ^ \ m p v p is the total particle kinetic energy, and we have used the fact that 
the v x B force is always orthogonal to the averaged velocity in the Boris push. The above 
derivation requires that the shape functions for interpolating E (Eqs. fZhlOHll and j (Eqs. 
[HZUmi be identical. Introducing Eq. UHlfor the grid electric held components, we find that: 


At 


u 




n+i/2^n+i/2 
zi,j 


From Eq. [TO], the hrst group of terms on the right hand side yields after telescoping the sum 
(integrating by parts): 


^2 Ax Ay 

ij 

= Y AxAy$+ 1/2 ($x[jx]Y/2,j + S yih^j+ y a ) 

ij 

= Y AxA y€f /2 ( e o s t (sl + ^)[0]i,i)" +1/2 




= -s t 


£o 

2 


Y AxA y ( s Mlj + tyWilj) 


71+ 1 / 2 


= -StW^ 2 , 


where we have regrouped the discrete terms using periodicity, and defined a discrete version 
of the electrostatic energy as 

w* = I £ AxAb (SMj + mil) ■ 


(41) 


*7 


Using Eq. [9] in the second group, we End after telescoping sums that the terms associated 
with both </> and A vanish because of the discrete Coulomb gauge V • A = 0: 

60 >; AxAy [A-e] 44.1/2 ,j&t&x [ 0 ] i+1/2 ,j T ht [t^] "i.ji-l-i/s) 


V 


~^o ^ ^ AxAySt T hy [Ay] ij ) 




= 0. 


*7 
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As a result, the second group of terms on the right hand side can be written as: 


— V Ax Ay 
ho 


2ho 


S t ^2 AxA y 


2/i 0 


S t ^ Ay 


u 


(fit[-Ax]i+l/2,j(S x + <5y) [Aj;] 1+1/2J 

+^[A y ]jj + i/ 2 (5^ + 

+ + 1 
{\&xA x \i +1 / 2 j + [<5j,A x ] i+ y 2J - 

+ [^X^J/]i,j+ 1 / 2 + [<5j/A/]i,j+V2 
+ + [SyAzfij) + / 

([^a:^y]ij+l/ 2 — [3yA x ]i+l/2,j) + [^x-AJ^- + [^J/Aj.]^- 


= -s t w B \ n+1/2 , 


where in the second step we have added and subtracted 2S x [Ay\ i j + i/ 2 S y [A x ] i+ i/ 2 j 1 and used the 
discrete Coulomb gauge S x [A x ] lJ+ i/ 2 + S y [A y \ i+ y 2J - = 0. The discrete version of the magnetic 
held energy is therefore defined as: 


Wb — ^ Ax Ay ([5 x A J/ ]j j _j_|_y 2 {S y A x \ i+ ii 2 j) + [5 x A z ]jj + [(^A^y 


(42) 




A discrete version of total energy conservation in the Vlasov-Darwin system [50] follows: 


S,(K + W* + W B )\ n *'l = 0. 


(43) 


3.3. Conservation of particle canonical momentum 

In 2D, the electromagnetic system has an ignorable direction, say z, and the associated 
particle canonical momentum p = mv + qA should be conserved, per particle, for all time. 
This is a consequence of the particle Lagrangian Af = mv 2 /2 + g(v-A —0) being independent 
of the z coordinates, as can be shown from the Euler-Lagrange equations (Hlj : 

d ( dSe \ - dS£ 
dt \ dv z ) dz 

The canonical momentum is defined as p = and hence is clear that: 

Pz = o. (44) 


We seek to enforce this conservation property numerically in our particle orbit integrator. 
As we shall see, this will constrain the form of the scattering of the electric held to the 
particles, and the gathering of the current (to conserve energy). We begin by writing the 
conservation of p z as: 

mv PjZ + q p A Z:P = 0, (45) 


11 










where 


(46) 


Az,p — ^ 2 A z/ j-j A 2 i-l'p *U) *5*2 (l/p J/j)- 




Equation 05] can be integrated over the substep u to v + 1, to find : 

(iripVp ~h q P A p ) z (rn p v p T q p A p ) z 0. 
Equation 07] can be rearranged as : 


(47) 


^,^+1 - yV 

P)Z p,z 

A T" 

p 


Jp_ 

m n 


E 




A^ 1 S 2 (x^ 1 - Xi)S 2 (y u p +1 - y.j) - A" zij S 2 (x" - Xi)S 2 {y" - %■) 


At" 


(48) 

which can be casted in the form of the implicit Boris pusher (Eq. 04] 071) as follows. Using 
the B-splines identities (Eq. [30]) and Taylor-expanding the shape functions, we End: 

A ^i!j S 2 ^p +1 ~ Xi)S 2 (y p +1 - yj) - A" z ^S 2 (x" - Xj)S 2 {y" - yj) 

A T u 

(§22 + AS22) + _ #iMr * 


Art 


At 


p 


-52(|/; +1 — 2/y) — S 2 {Vp ~ yj) qU+ i/ 2 


z,*J 


,+Va 


At 


^ Op - X i)A ZM 


A v+1 — A" 

z,i,j z,i,j 

At" 


[S 22 + A§ 22 ] + Vpt.'' 2 


p 

dS 2 


p ' x dx 

..,+y 2 ds * 

J p,y 


S2 +1/ \y P - vMl, 

S 2 +1/2 (y P - yj)A u z fh 


.,+!/2 


dy 


(Xi-Xp +1/2 ) 


( ^“hl/2 \ 

\Vj ~Up ) 


where: 


§22 = S u 2 +1/ \x p - Xj )S v 2 +1/ \y p - yj ) 


AS 22 = 


S 2 (< + - Xi) - S 2 (x" - Xi) S 2 (y p - yj) - S 2 (y" - yj) 


It follows that, 


v" + z l — v" z 
At" 


m n 


Qp_ 

m„ 


A" +1 — A" 

E - ™ (S 22 + AS 22 ) 


V. 


1,3 
u+1/2 


At, t 


+ 


(49) 

(50) 

(51) 

(52) 


p,x 


E (-<t ? d * s A - <A E -htf+s 22 


hi 


i,3 


Clearly, the second term in the right hand side of Eq. 07] is the Lorentz force. The Erst 
term provides the modified shape function for scattering the ^-component of the electric 
field (Eq. [28]) . and the AS 2 2 correction is 0 (At ") 2 (commensurate with the truncation error 
of the finite-difference scheme). The corresponding current density component (Eq. fl4l) . as 
advanced earlier in this study. 
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3-4- Binomial smoothing: impact on conservation properties 

As in earlier studies fit., (2oj, in periodic systems we apply binomial smoothing to re¬ 
duce noise level of high k modes introduced by particle-grid interpolations jl]. Smoothing 
preserves the conservation properties of the implicit Darwin model when implemented ap¬ 
propriately. The governing Darwin-PIC equations with binomial smoothing read: 


1 

ho 


02 + O 


\^-x\ i+y2,j 

[Ay\i,j+l/2 


n+i/2 


+ SM 


\jx]i+l/2,j 

{jylhj+yz 

\jz\i,j 

e O + K 


n +!/2 


n+1/2 
h3 

^p p 


v^+1 _ 

p _ _p_ 

At" 


A T v 

(SM(E n+1/2 ) l 


m 


\" +,/2 


^0 ( ^ 2 /[ 0 ]i,j+ 1/2 I 

,(53) 

V 0 / 


SM (S X [j X ]iJ + Sy[jy]iJ 

)”1&) 

v u+1/2 

x,p 1 

(55) 

V ; +1/2 x SM{ B" +1/2 ) p ) 

. (56) 


The binomial operator of a grid quantity Q in 2D is defined as the tensor product of the 
binomial operator in ID: 

SM(Q) 4i = SM(Q) i SM(Q) j: (57) 

where: 

SM(Q)i = ^-i + 2 ^ + ^+i ( 58 ) 

Smoothed particle quantities are defined as: 


SM(Q) P = ^ SM(Q)ijS(x p - Xi)S(y p - yj). (59) 

i 


Owing to the binomial smoothing property that 'Yh i AiSM{B)i = '^2, i BiSM( y A)i in each 
periodic direction, it is straightforward to show that energy and charge conservation theorems 
remain valid |20|. Canonical momenta conservation also survives when replacing A Z}i j by 
SM(A z )ij in the previous section (as the derivation is mostly based on Taylor expansion of 
the shape functions). 


4. Moment-based preconditioning of the multi-dimensional Vlasov-Darwin PIC 

solver 

The final set of equations solved in this study is comprised of the set of field-particle 
equations, Eqs. [5311561 We invert this system using a JFNK solver with nonlinear elimination, 
implemented and configured as described in Ref. [20]. In particular, the particle equations 
are enslaved to the field equations (so-termed particle enslavement) in a way such that only 
field variables (0, A) are involved in the nonlinear residual. An advantage of this approach 
is that only a single copy of the particle variables is required (as in explicit PIC algorithms), 
and the memory footprint of the nonlinear solver is determined by the low-dimensional field 
variables. This results in memory requirements for the nonlinear solver comparable to fluid 
simulations. 
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For practical simulations, the convergence JFNK must be accelerated for efficiency. For 
Krylov iterative methods, this task is performed by the preconditioner. In the context of 
implicit PIC simulations, we seek an inexpensive approximation of the linearized kinetic 
solution. The basic idea is “physics-based”, i.e., we employ the linear response of p and j, 
as obtained from approximate, linearized moment equations [7] to advance the linearized 
electromagnetic fields. This idea is at the root of semi-implicit moment methods (26-31 , 
and has already been successfully explored to some degree in fully implicit ID PIC j22 l24i| 
(with limited models for the current response). Here, we generalize the fluid preconditioner 
to multi-D, and consider a general current response for moderately magnetized plasmas 
(i.e., l o pe > ui ce , namely, the electron plasma frequency is larger than the electron cyclotron 
frequency). We will demonstrate that the so-derived preconditioner features the correct 
ambipolar and MHD asymptotic responses, and is therefore suitable for arbitrary mass ratios 
and system lengths, provided that the plasma does not become strongly magnetized. 


4.1. Formulation of the moment-based preconditioner 

The preconditioner development begins with the linear Jacobian system resulting from 
the Newton-Raphson iteration, which can be written as 


J(u k )5u k = —R(u fe ), 


where = (0, A) denotes the current state solution vector, J(u^) is the Jacobian matrix, 
and R(ufc) is the nonlinear residual. The linearized field equations read: 

= -Ra, (60) 

= ~R^ ( 61 ) 

where the right-hand-sides are the residuals of Eq. [2] and [31 respectively, and Sj is the linear 
kinetic current response (obtained from particles). Here and in what follows, the J-terms are 
small linear quantities (which should be distinguished from finite-difference operators St, S x , 
etc.). Accordingly, the Jacobian matrix may be written in block form as: 


i n , ri V Sf 

—V 6A + Sj — e 0 . 
p 0 At 


At r 


V • dj 


J(Ufc) 


U^a 
La,</> D a 


Due to the presence of particle interpolations and orbit integrations, the explicit form 
of these linear operators is extremely cumbersome to formulate, and impractical for pre¬ 
conditioning purposes. Here, we pursue an alternate route, where the current response is 
estimated from an approximate moment system. As in previous studies ( 22 I . 2jl |. we begin 
with the first two moment equations (for each species): 


&T_ 

dt 


dn „ „ 

~St +v ' T 


a 1 

— (nE + T x B) -|-Vp 

m m 


0, 

0, 


(62) 

( 63 ) 
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where n is particle density, T is particle flux, and p is scalar pressure. We have neglected the 
convective and stress tensor terms. The continuity and momentum equations are linearized 
as: 


QSn 

-^ + V ■ hr = 0, (64) 

— (nhE + hnE + r x hB + hr x B) - —V(T5n) = 0, (65) 

ot m m 

where T is an effective temperature obtained from particles. The terms without the h symbol 
are available from either the current Newton state or from the current particle state. The 
orbit-averaged linear current response is estimated as hj = 'Yh s=ei q s b T s , with some caveats 
that will be explained below. 

After temporal and spatial discretization, our preconditioner approximately inverts the 
system of Eqs. E0 EU E3J and 1651 to find hu = (hA,hc/>). In principle, these equations are 
fully coupled, and themselves a challenge to invert [52| . It is, however, possible to decouple 
them by considering the weakly to moderately magnetized regime, u pe > u ; ce . In this regime, 
the electrostatic response is faster than the electromagnetic one, and one can neglect the 
feedback of electromagnetic evolution on the electrostatic response. In practice, this means 
one can neglect U in the preconditioner, i.e.: 


P(Ufc) ~ 


D (/, 0 

La,^ Da 


where the tilde indicates that the linear operators are modified after considering the linear 
moment closure. As a result, we can decouple the electrostatic potential solve and the vector 
potential one. This, however, implies that the preconditioner will be most effective for weakly 
and moderately magnetized plasmas. 


4-2. Implementation of the moment-based preconditioner 

Equations EH ESI are time-discretized by time-averaging the equations over a time interval 
of [0, At] (by applying A. f At dr) 0: 


—+ V-^f = 0, (66) 

—— (nhE + 5nE + f x 5B + <5f x B)- —V (TSn) = 0, (67) 

At m 2 m 

where all quantities are time-centered (at the half time level) except for Sn (at the integer 
time level). Accordingly, we approximate the time-derivative terms as: 


1 f At , ddn 5n 
At J 0 dr ~ At ’ 

1 r At dST 25T 

At J 0 dT nr " Ar¬ 


ies) 

(69) 
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The integration has been performed assuming that the current state solution does not change 
with r. Equation [67] for the species s can be formally inverted as: 


oi s 


SFJ + MF; X Bo + a 2 (<5F 6 T • B 0 )B 0 ] 
1 + (a s B 0 ) 2 


(70) 


where a s = Atq s /2m s , and hF s = n s <5E + <5n s E + r s x 5B — . Therefore, the 

approximate linear current response is given by: 


~ qs6T * = 5Z q * 


a 


SFJ + [a.dF- x B c 


o, 2 (5F; ■ B 0 )B C 


1 + (a s Bn 


(71) 


The implementation of the preconditioner is as follows. We begin by solving for the elec¬ 
trostatic response, which is decoupled from the electromagnetic one owing to the moderately 
magnetized assumption. For this, we invert the coupled linear operator: 


with: 


= -R t , 

(72) 

8n„ 

—- + V-8T ( j )tS — 0, 

(73) 



V7^x j? V7J. V{T s 8n s ) 

= —n s \8(p — 8n s \7 <p -. 

2 q s 

(74) 

8Tl s 

SF;,. + MF;, X B„ + a 2 (5F^, - B„)B„] 

= CV 

s 1 + (ot s Bo) 2 

(75) 

s'u 


(76) 


S 


Once S(j), 8n s are found, we solve for the electromagnetic response 5 A from: 

—V 2 <5A + <5j A = -R A + eo ^-5U, (77) 

2 fi 0 At 

with: 


<5jA 

5F i)i + MFi , x B„ + a ;(5Fi ,. B 0 )B„] 

_ / ; i , / pH 1 

“ 1 + {a s B o) 2 

(78) 

SF A,s 

8 A 

= — n s — -1- r, x V x 8A. 

At 

(79) 


The procedure described above does not guarantee V • <5A = 0. This can cause stalling 
of the convergence of the nonlinear residual, which must satisfy the solenoidal involution 
exactly upon convergence. To prevent stalling, it is necessary to divergence-clean 8A in the 
preconditioner. For this, we consider 8A' = 8A + VT, and solve for T from V • 8A' = 0 as: 


V 2 T + V ■ 8A = 0. 


(80) 


This works discretely because T is defined at cell centers, and 8A at cell faces. 
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4-3. Asymptotic properties of the preconditioner 

The computational complexity of the moment-based preconditioner is substantially re¬ 
duced compared to the original kinetic solver. If successful, the preconditioner can deliver a 
key algorithmic advantage and important computational savings. However, whether the pre¬ 
conditioner is successful or not rests critically on whether it features the correct asymptotic 
limits. 

In practical simulations of interest, there will be regions where kinetic effects are impor¬ 
tant, but others where MHD fluid models will be appropriate descriptions. To achieve a 
true multiscale character, the implicit PIC algorithm must be able to deal with these lim¬ 
its successfully. This, in turn, requires the preconditioner to be able to span the relevant 
asymptotic regimes seamlessly. Here, we concern ourselves with two asymptotic limits of 
practical interest for multiscale simulations, namely, one spatial (large domain sizes L , much 
larger than kinetic scales, which tests the transition to fluid regimes), and another temporal 
(massless electrons m e —y 0, which tests the transition to ambipolarity). 

The asymptotic properties of the preconditioner are determined by the behavior of the 
approximate linear current response in Eq. [7U as it embodies the plasma response to changes 
in the electromagnetic fields. We will demonstrate in what follows that the form chosen in 
Eq. [Tl] does in fact have the correct asymptotic behavior in both limits. For analysis, we 
reconsider the current response of a single species: 

_ hF; + [hF; x a,B 0 + • B 0 )B 0 ] 

1 + (a.Bo) 2 

<5F; = «[„,R| + fx (5B — 


(81) 

(82) 


We begin by normalizing using Alfvenic units (i.e., arbitrary length L, density no, magnetic 
field B 0 , mass m 0 , and the Alfven speed va = B 0 /In these units, we can write 
the normalized current response (indicated by a hat) as: 


Ah 



„ ^F; + [( <*.Bo)6F - 

Oi s B, o)- 

q s 5 (n s E) + q s t s x < 5 B - 


xB 0 + (a s B 0 )\5Fj -B 0 )B 0 \ 
1 + (a s Bo) 2 

V5p; 

2 


(83) 

(84) 


It is clear that the main dimensionless parameter is a s B 0 = 0.5Atco C}S = At/d s ^/m s , where 
d s = c/Lu PtS is the normalized ion skin depth. This parameter controls both temporal (via 
At or m e ) and spatial (via L) asymptotic limits. 

As stated above, we are interested in two distinct asymptotic limits: 1) massless electrons 
(m e —y 0, a e B 0 —> oo) and 2) large domains (L —y oo, a s Bo —y oo for all species). The former 
corresponds to the stiff limit when the plasma frequency is arbitrarily fast and the plasma 
becomes ambipolar, and the latter to the transition to fluid-relevant regimes. We investigate 
these next. 
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4-3.1. Massless-electrons limit 

In this limit, a e Bo 1, with a s Bo arbitrary for other species. Considering the current 
response from Eq. [83] for electrons and taking this limit, we find: 

4 « 8F~ x Bo + (a e B 0 )(5F~ • B 0 )B 0 . (85) 

The first term corresponds to the perpendicular current response, while the second term 
corresponds to the parallel current response. The first term contains terms that correspond 
to electron drifts (E x B, Vp, etc.), which is the correct ambipolar response. The second 
term is proportional to a e B o 1, and may seem asymptotically ill posed. However, when 
considering the full field response equations (Eqs. [60] l6T]h it is clear that this term will 
force the parallel current response to be of ^[(a^Ho) -1 ] ~ y/ml 1, which is the correct 
ambipolar response in the absence of collisions. 

We conclude that the fluid preconditioner behaves regularly when electrons become mass¬ 
less, with one caveat: the preconditioner will lose effectiveness whenever the electron mass 
is small enough to violate the moderately magnetized assumption (i.e., c o pe > uj ce ) embedded 
in its formulation. We will confirm that this is indeed the case in the numerical experiments 
section (Sec. [5]). 

4-3.2. Large-domain limit 

In this regime, oi s Bq 1 for all species, and therefore: 

4 « 8F~ x Bo + (a s B 0 )(8F; ■ B 0 )B 0 , 
and the total current response is: 


= xBc 


J2(a,B„)6FJ - B f 


B 


o- 


But: 





E 


s;0 



x SB - ivhp. 


Here, q s h s ~ 0 because of quasineutrality (which is enforced in the preconditioner by the 
solution of the electrostatic potential equation, Eq. IfiTTh It follows that the large-domain 
current response is: 

<y ~ + 4> 

with: 


fi-L = 

«l = 


j x 8B - -W5p 


x B 0 , 


J2M)SP; ■ Fi, 


B r 
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It is clear that (up to factors of l /2 due to the temporal discretization of choice) the per¬ 
pendicular current response essentially follows the MHD response, as obtained from the 
linearization of j x B = Vp. The parallel current response essentially forces the parallel 
component of hF~ to be of ff[(a s Bo)~ 1 ] ~ L~ l <C 1, which is also consistent with the 
MHD response. 

We conclude that the fluid preconditioner recovers the fluid MHD response in the limit 
of very large domains, and is therefore asymptotically consistent with a fluid description in 
this limit. Numerical experiments in Sec. ED will verify that the preconditioner performs well 
for arbitrary domain sizes. This behavior is key for a multiscale particle kinetic algorithm. 

4-4- Non-local response: strict moment differencing in the preconditioner 

Despite the excellent asymptotic properties of the preconditioner, in certain contexts a 
direct finite-volume discretization of the linearized moment equations can become ineffective 
as a preconditioner when the grid size is larger than the electron skin depth. This perfor¬ 
mance degradation can be quite limiting in terms of efficiency for certain applications. We 
address the origins and the solution to this issue next. 

To understand the origin of the problem, we first take a look at the preconditioner vector 
potential equation in the weakly magnetized limit (i.e., when a s B 0 -C 1), which from Eqs. 

[73 [791 gives 0 : 


V 2 hA-hA^-^2/i 0 


-Ra + Co 


V<fy 

At 




( 86 ) 


For Ax < d e , the first term (Laplacian) dominates, and the preconditioner remains effective. 
However, for Ax > d e , the second (inertial) term dominates. The second term implies an 
instantaneous local response between the plasma current and the vector potential, which 
is not completely accurate due to particle-mesh interpolations. This is clearly seen when 
the response is obtained directly from the moment definition from particles, e.g., Tp = 
'f2p v Py S( x — x p )/Ax in ID. Approximate linearization gives: 


AF « 

UL i y ~ 


y 8v P yS (Xj - Xp) /Ax 


mA , r ^ py S(Xi X p ) 


~ x p) s ( x * - x p)> 


where we have neglected the linear response from the shape function (which is consistent 
with our neglecting the pressure gradient in Eq. I86lh We see now that the linear current 
response is non-local: it has contributions from nearby cells according to a “mass-matrix” 
of shape functions, which numerical experiments show are critical for the efficiency of the 
preconditioner for Ax > d e . This non-local nature of the coupling due to particle-mesh 
interpolations has been discussed in earlier studies on the direct implicit method 35), where 
it was termed “strict-differencing.” 
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The analysis becomes much more complex when magnetic fields and other terms are taken 
into account, and the resulting equations become difficult to solve. What we do instead is 
to use Eq. [TO] for all species to compute a local response Jj, and then apply the mass matrix 
to account for non-local effects. More specifically, in 2D we filter the current linear response 
according to: 


^s,ij — ^2 


5T 


s.lm 


-/V-t) 1 ryi 
Lm v ' lm 


53 Xi- x„)S(y m - y r )S(xi - x r )S(yj - y r ), 


(87) 


where N p j m is the number of particles in the (l,m) cell. Computing the mass matrix ac¬ 
cording to the actual particle positions at every preconditioner application can be quite 
expensive. Instead, we assume the particles are uniformly distributed in each cell, and 
precompute analytically the mass matrix elements as: 


^S(xi - x p )S(y m 

p 



y P )S{xi 


x)S(y m 


x p)S(yj y P ) 

y)S(xi - x)S(yj - y) dxdy. 


where 1 < a < N x and 1 < b < N y , N P)a b is the number of particles in the (a, b) cell, 
and flab — AxA y is the cell volume.We use these analytical coefficients throughout the 
simulation. This strategy has worked quite well in the numerical examples presented here. 


J h 5. Solver implementation in the preconditioner 

We solve the two steps in the preconditioner (8(f) step, Eqs. [72] and (73] and 8 A step, 
Eq. [771) , each coupled with the corresponding linear current responses (Eqs. [74] and [751 
respectively), using a classical multigrid (MG) solver. Both the 8(j), 8A update systems 
involve coupled PDEs, which we solve together in our MG solver. For a smoother, we 
employ a block damped Jacobi iterative method, with the damping constant equal to 0.7. We 
employ several MG V-cycles, with 5 smoothing passes in both restriction and prolongation 
steps (V(5,5) in MG jargon), until the linear residual is converged to a tolerance of 10 -2 for 
each system. 

While we do not have a mathematical proof that damped Jacobi is a good smoother 
for these systems, we note that the staggered nature of the grid keeps the mesh stencil 
very tight, and this provides the necessary diagonal dominance for the Jacobi smoother to 
perform adequately. The strict differencing discussed in the previous section spreads the 
stencil somewhat, but it is very diagonal dominant by construction, and does not seem to 
affect the MG smoothing step. Overall, our MG solver has performed very well in all the 
numerical examples presented in the next section. 


5. Numerical tests 

We have developed a 2D code in Fortran which employ the implicit algorithm developed 
in this study. For comparison, we have also developed an explicit Vlasov-Maxwell code that 
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also employs the potential formulation, conserves local charge, and enforces the Coulomb 
gauge exactly (see Appendix 

In this section, we consider two test cases for benchmarking and demonstrating the 
favorable properties of the 2D-3V implicit PIC algorithm: an electron Weibel instability and 
a kinetic Alfven wave instability. These test cases are stiff multiscale problems, with the 
instabilities growing at a much slower rate than fast time scales supported by the system 
(e.g., the electron plasma wave frequency). For verification, we compare implicit simulation 
results against explicit ones, and we verify linear growth rates. We report several conservation 
diagnostics, including global energy, local charge, total momentum, and total canonical 
momentum. We carry out temporal rate-of-convergence studies, which demonstrate the 
second-order temporal accuracy of the scheme. The performance of the preconditioner is 
assessed, and we monitor the CPU time of implicit computations compared with explicit 
ones. In our numerical experiments, we report implicit vs. explicit CPU time speedups 
larger than 10 4 . 


Appendix A). 


5.1. The electron Weibel instability 

The Weibel instability test case is a weakly magnetized example. The setup is simi¬ 
lar to that used in Ref. (22j|, except that the configuration space is now 2D, and some 
changes in parameters have been made. Electrons are initialized with an anisotropic Maxwell 
distribution with T ey ^ z /T ex = 9, and the thermal velocity parallel to the wave vector is 
VeTx = y/T ex / m = 0.1. Ions are initialized with an isotropic Maxwell distribution with 
ViTx = 0.1. The timestep is taken to be At = 1 (cn^ 1 ), which is about 14 times larger than 

the explicit CFL (~ 1/cy/^Uj + The simulated domain has L x x L y = n(d e ) x n(d e ), 

with 32 x 32 uniform cells and periodic boundary conditions. The average number of parti¬ 
cles per cell of each species is 2000. A Ufunction-like perturbation is introduced by shifting 
the velocity of all the electrons in one cell by a small amount: 

v p = v p0 + a (88) 

where v p o is particle velocity sampled from the Maxwellian distribution, and a = 4 • ICR 2 is 
the perturbation level. 

For comparison, the maximum linear growth rate (7 = 0.1017) supported by the domain 
size is found from the dispersion relation of electromagnetic waves in a bi-Maxwellian plasma 
53l: 



where a = e,i, = c o/k x ^2T ax /m a , and Z'(£) is the first derivative of plasma dispersion 
function. The excellent agreement between the simulation and theory is shown in Fig. 
[21 The time history of conserved quantities (e.g., charge, energy, momentum, canonical 
momenta, and V • A) of the simulated system is depicted in Fig. [31 for both implicit and 
explicit computations. We see that charge conservation is preserved at round-off level for 
both explicit and implicit algorithms. For the implicit computation, energy conservation 
is controlled by the JFNK nonlinear tolerance (a relative tolerance of ICR 6 in used in this 
study), and the canonical momenta conservation is controlled by the Picard tolerance level for 
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Figure 2: Time history of the magnetic field energy evolving from an electron Weibel instability. Excellent 
agreement is found between explicit (with 32 x 32 uniform cells, At = 0 . 05 ), implicit, and the theoretical 
linear growth rate. In the linear stage, the magnetic field energy grows as Wa = T / FAo ex p(27w pe i)- 


orbit integration (an absolute tolerance of 10 -12 is used). With respect to exact energy and 
canonical momentum conservation, we see that the implicit computation out-performs the 
explicit one by many orders of magnitude. As in earlier studies |20|, the particle momentum 
in the ^-direction is not conserved exactly, but the error is relatively small. The momentum 
conservation is slightly better in the explicit computations, likely due to the use of a much 
smaller time step. Finally, as expected, V ■ A is well conserved in both algorithms. Explicit 
results are close to round-off level; implicit results are dependent on the nonlinear tolerance, 
owing to exact charge conservation and our involution-free Vlasov-Darwin formulation. 

5.1.1. Temporal convergence study 

We have performed a temporal convergence study of the CN-based implicit PIC scheme. 
Numerical experiments use a fixed Ax and a series of timesteps (At). We record the solutions 
at final times, t = 8. Relative numerical errors are obtained by comparing the result of these 
solutions with a reference solution (obtained by using a small timestep At = 4 x 10~ 4 ). 
Results are shown in Fig. |4] We confirm the second-order temporal scaling of the error for 
both A and as expected from the time-centered CN scheme employed in the algorithm in 
both the held and particle governing equations. 

5.1.2. Preconditioner performance 

We use the Weibel test example, with various mass ratios and grid sizes, to evaluate the 
performance of the fluid preconditioner proposed in Sec. [4l We test the solver performance 
with and without preconditioning. The number of linear and nonlinear iterations per time 
step are monitored and averaged over 10 timesteps. A key figure of merit is the number 
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Figure 3: Conserved quantities in the simulation of the electron Weibel instability. Charge conservation 
is measured as the (root-mean-square) rms of the continuity equation, numerically evaluated at grid cells 


Hi= , i(/ 9 ™ +1 — Pi + At(j i+ i/ 2 — ji_ i/ 2 )/Ai) 2 , where N g is the number of grid-points. Energy conservation 
is measured as the accumulated change in the total energy (c.f. Eq. mi) with respect to the initial one. 
Momentum conservation in the x direction is measured as fnpVp tX / E,„ rn p Vth,x , with p the particle index 
respectively. The maximum error in the conservation of canonical momenta for all particles is measured 
as maxp (| m p v p +1 + q p A^ +1 — m p v p — q p Ap |) in the z direction. Finally, the rms of V ■ A is found as 

\/E;/ \(A X i+i/2,j )/Ax 4- {Ayij+ 1/2 v4y^j_iy 2 )/Ai/)] . 
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Figure 4: Numerical convergence rate of the CN scheme. Second-order scalings are verified for both </> and 
A. The y-axis is defined as the relative error [Q(At) — Q r ef]/Qref, where Q is the I/ 2 -norm of the solution. 
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Table 1: Solver performance with and without the fluid preconditioner for the electron Weibel case with 
L x x L y = 22 x 22 ( d 2 e ), N x x N y = 128 x 128, and N pc = 200. For all the test cases, At = O.lw” 1 . The 
Newton and GMRES iteration numbers reported are per time step, averaged over 10 timesteps. 


milm e 

no preconditioner 

with preconditioner 

Newton 

GMRES 

Newton 

GMRES 

25 

5.8 

192.5 

3 

0 

100 

5.7 

188.8 

3 

0 

1836 

7.7 

237.8 

4 

2.8 


Table 2: Solver performance with and without the fluid preconditioner for the electron Weibel case with the 
various mesh sizes (keeping mi/m e = 1836 and other parameters the same as used for Table [lj. 


N X X Ny 

no preconditioner 

with preconditioner 

Newton 

GMRES 

Newton 

GMRES 

16 x 16 

3.7 

20 

3 

0.9 

32 x 32 

4 

38.5 

3 

0.9 

64 x 64 

4.3 

79.9 

3 

0.2 


of function evaluations {Nfe)i found from the sum of linear and nonlinear iterations (Nee 
equals the number of calls to the particle integration routine). 

Table [T| shows the solver performance with respect to the mass ratio. The number of 
nonlinear (Newton) iterations is a function of the nonlinear tolerance (which is 10~ 6 in this 
study), and is about 4 to 7 upon convergence. For the unpreconditioned case, the number 
of linear (GMRES) iterations increases with the ion-electron mass ratio. This is expected, 
because the problem becomes stiffer (the electron plasma frequency becomes larger) when 
At is pegged to the ion plasma frequency. With preconditioning, the solver convergence rate 
is essentially independent of the mass ratio, confirming the asymptotic analysis in Sec. 14.31 
We note that, because we use the preconditioner to provide the initial guess for the GMRES 
solver, sometimes no GMRES iterations are needed for convergence. The improvement in the 
solver convergence rate (measured by Nee ) achieved by the fluid preconditioner is between 
30 and 60. 

Table [2] shows the solver performance with respect to the grid resolution for mj/m e = 
1836. While the unpreconditioned case is very sensitive to grid refinement, the preconditioned 
solver is not sensitive at all (owing to the use of MG solvers in the preconditioner). The 
improvement in the solver convergence rate achieved by the fluid preconditioner is again 
large, of more than two orders of magnitude. 

As will be demonstrated in the next section, the preconditioner performance is also robust 
against variations in the domain size: the number of Newton iterations is about 3 to 4 and 
the number of GMRES iterations is about 1 to 3 when we vary the domain size L by three 
orders of magnitude. This also confirms the asymptotic analysis in Sec. 14.31 


25 

























Figure 5: CPU speedup of implicit vs. explicit PIC for the electron Weibel instability case as a function of 
kXp ~ A d/L- Speedups of several orders of magnitude are possible for large domains (Ad/L -C 1). 


5.1.3. CPU speedup of implicit vs. explicit PIC 

The efficiency advantage of the implicit PIC approach vs. the explicit one is summarized 
in Eq. 60 of Ref. 22], repeated here for convenience: 


CPU'. 


0.02 


CPUimp ( k\ D ) d v A 


mm 


k\ D ’ v A \ rrii 



NfeN Picard 


(90) 


In Eq. [90l k = 2i t/L, d e is the electron skin depth, Nfe is the number of function evaluations 
(which indicates the number of repeated particle orbit computations), and Np icar d is the 
number of Picard iteration for orbit integration. We see that the speedup increases with 
larger domain sizes (where kXp -C 1, d e S> Xp), and with m* much larger than m e . Here, 
we vary the domain size L. It is expected that the cost of the implicit simulation will not 
change much with the domain size, provided that the number of cells and particles per cell 
are kept fixed, and the nonlinear iteration count is well controlled by the preconditioner. On 
the other hand, the explicit code is forced to maintain the same grid spacing and time step 
as the domain increases, to avoid numerical instability. 

Results are depicted in Fig. [5] and show that the speedup ( CPU ex /CPU im ) is closely 
proportional to (/cA^) -3 for small domain sizes, in agreement with Eq. EEH (for d — 2, and 

kXE < y^)■ As L increases (or kXp decreases), the scaling index becomes ~ 2, as expected 

from the same equation. The scaling index changes at —C- ~ \f^ ~ 0-0^5, also expected 

(note that c/va 1 in this example, since it is weakly magnetized). Overall, these results 
are in a good agreement with our simple estimate. Significant CPU speedups are possible 
for system sizes much larger than the Debye length (> 10 4 for kXp < 1CD 3 ). 
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Figure 6: Time history of the magnetic field energy for the KAW simulation, demonstrating excellent agree¬ 
ment with linear theory. 

5.2. The kinetic Alfven wave ion-ion streaming instability 

For our second test, we consider the excitation of kinetic Alfven waves (KAW) by ion-ion 
streaming in 2D-3V |54j. This is a moderately magnetized case, with an imposed external 
magnetic held. The instability is caused by interactions between the wave and the streaming 
ions. The simulation parameters are similar to those used in Ref. jH3|. The mass ratio 
is mi/m e = 25, which is nominal (we will vary this parameter later). We use ions as the 
reference species. The simulated domain is 10 (di) x 10 (di) (the unit length being the ion 
skin depth), with 64 x 64 uniformly distributed cells (with each cell size in each direction 
about 10 times larger than the Debye length) and periodic boundary conditions, and the 
average number of particles per cell of one species is 500. The external magnetic held 
is set to be Bq = 0.0667 along the x-axis. The plasma consists of Maxwellian electrons 
with v e T = 0.0745 (/3 e = 0.1), and two singly charged ion components, i.e., an ambient ion 
component a and an ion beam component b, with number densities n a = 0.6n e and Ub = 0.4n e 
(where n e is the electron density). The two ion components have v a T = 1.925 x 10 ~ 2 and 
VbT = 7.45 x 10 -3 , and a relative streaming speed with respect to each other of v a d = 1.5 va, 
and Vbd = va, with va = y/m e /mi/3 the Alfven speed along the external magnetic held 
direction. The timestep is again set to At = O.hnX 1 (about 10 times larger than the explicit 
CFL for the mass ratio considered). The simulation is started without perturbing a specihc 
wavelength. Consequently, waves with all wavelengths and with all angles to B 0 supported 
by the simulation domain are excited. Figure [HI shows the simulation result of the magnetic 
energy density, which is again in excellent agreement with linear theory (the growth rate for 
this configuration is found to be 7 = 0.225, using the same linear Vlasov code as in Ref. 

@). 
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Table 3: Solver performance for the KAW case with and without preconditioning for L x xL y = 22 (df) x22 (df), 
N x x N y = 32 x 32, and N pc = 200. For all the test cases, At = O.lu;” 1 . Convergence data is averaged over 
10 time steps. 


mi/m e 

no preconditioner 

with preconditioner 

Newton 

GMRES 

Newton 

GMRES 

25 

4 

171.9 

3.2 

1 

150 

4.5 

764 

4 

2.9 

600 

7.4 

4054.8 

4 

11.9 


Table 4: Solver performance for the KAW case with and without preconditioning, with fixed io pe /uj ce = 3. 
Other parameters are the same as in Table [3l (NC denotes “no convergence”.) 


rrii/m e 

no preconditioner 

with preconditioner 

Newton 

GMRES 

Newton 

GMRES 

150 

4.5 

738 

4 

3 

600 

5.8 

1887 

4 

3.9 

1836 

NC 

NC 

4 

5.9 

7344 

NC 

NC 

4 

12.9 


5.2.1. Preconditioner performance 

Table [3] shows the solver performance with respect to the mass ratio for the KAW test. 
From these results, it is clear that the solver performance is much more sensitive to the mass 
ratio in this example, since both unpreconditioned and preconditioned solvers degrade with 
the mass ratio (although the preconditioned one degrades less). The origin of the performance 
degradation is the moderately magnetized nature of this test case: as the mass ratio increases, 
the condition uj pe > u ce (see SecSj) is violated, thus leading to the performance degradation. 
To confirm this behavior, we conduct a slightly different test, where we fix oo pe /oo ce = 3 by 
reducing Bq as we increase the mass ratio. The results are shown in Table [4l As the mass 
ratio increases from 150 to 1836, co ce At increases from 0.4 to 1.4, and the performance of the 
preconditioner remains reasonably bounded. The performance degrades with the mass ratio 
as it increases further by a factor of 4, with the number of GMRES iterations approximately 
proportional to u; ce A t (~ 3). The reason is that, in this regime, electron Bernstein modes are 
excited at multiples of co ce , which are not captured by the moment preconditioner proposed 
in this study. The performance recovers with a reduced time step, when Bernstein modes 
are resolved (for instance, the number of GMRES iterations is about 4 and 6 for uj ce At = 1 
and 1.5, respectively). 

6. Discussion and conclusions 

In this study, we have developed a two-dimensional, conservative, fully implicit PIC 
algorithm for the (non-radiative) Vlasov-Darwin equation set. The approach builds on (and 
generalizes) a previous successful ID implementation [22}. Nonlinear convergence between 
particles and fields is enforced to a tight nonlinear tolerance via a multigrid preconditioned 
Jacobian-free Newton-Krylov solver. Particles are enslaved in the nonlinear function, so 
they do not appear explicitly as dependent variables in the nonlinear residual. As a result, 
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the nonlinear solver only iterates on fields, resulting in a much reduced memory footprint. 
Only one copy of the particle population is needed by the algorithm. The algorithmic 
advantage of the fully implicit scheme is its absolute stability, which relaxes the numerical 
instability constraints of conventional explicit schemes. This is especially beneficial for large- 
scale simulations, with system sizes much larger than the electron Debye length. 

The formulation conserves exactly local charge, total energy, and the canonical momen¬ 
tum component in the ignorable direction. It also automatically preserves exactly the two 
involutions in the discrete, namely, Gauss’s law and the Coulomb gauge (the latter being 
critical for energy conservation). This is accomplished by the minimum set of (A — 0) equa¬ 
tions proposed in this study. A (2nd-order) space-time-centered finite-difference scheme, 
together with compatible interpolations with multi-D shape functions, are key to achieve 
simultaneous conservation of local charge, global energy, and canonical momentum. 

The nonlinear, multidimensional, implicit kinetic algorithm is accelerated with a moment- 
based preconditioner. The preconditioner assumes a moderately magnetized regime where 
a; P) e > 0 J c<e , and is formulated such that it features the correct asymptotic limits for arbitrar¬ 
ily small electron mass (ambipolar limit) and arbitrarily large domains (MHD limit). The 
resulting linear systems are effectively inverted using multigrid methods, which lend the ap¬ 
proach a grid-independent convergence rate. Solver performance is also largely independent 
of the electron mass, except when the plasma becomes strongly magnetized. In strongly 
magnetized regimes, performance is recovered when 0 J ce At ~ 1. 

The algorithm has been verified against an explicit Vlasov-Maxwell solver, and against 
linear theory growth rate predictions, without resolving the Debye length or plasma wave 
frequency. A back of the envelope estimate for the explicit-to-implicit CPU speedup predicts 
that significant speedups are possible when A £>/L <C 1. Specifically, for sufficiently small 
values of A d/L, the CPU speedup scales as (L/XdY, which can be a very large number. 
Numerical experiments in this study have demonstrated speedups of more than four orders 
of magnitude. 

Future work will focus on extending the preconditioner to strongly magnetized regimes, 
and to generalize the solver to curvilinear geometry (as was done in the electrostatic case in 
Ref. |56|). 
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Appendix A. Explicit Vlasov-Maxwell solver 

We briefly describe the explicit scheme implemented in this study. The scheme is al¬ 
most the same as that proposed in Appendix B of Ref. (57^, except for aspects of charge 
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conservation. We begin with Maxwell’s equations in the (A — 

1 d 2 A 

— V 2 A + j - e o 9jV0 - e 0 ^—- = 

/i 0 ot 2 

e o V 2 0 + p = 


0) formulation: 

0, 

0, 


(A.l) 

(A.2) 


together with the leapfrog particle pusher to advance (x p ,v p ) for all particles. Equations 
IA.11 IA.2~l are discretized with central differences in time and space to obtain: 
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J 1+ 1 / 2 
hi ’ 


+ 


[A x ]i+1/2,j \ 
[Ay]i,j+y2 (- 4 - 3 ) 

[Az]iJ / 

(A.4) 


where 8 2 [A] n = ( A n+l — 2A n + A n_1 )/At 2 , and 8 t [(j)] n = (0 n+1 / 2 — 0 n-1//2 )/At. All the other 
hnite-difference notations are the same as those introduced in Sec. [3J The time-integration 
of the whole system is performed in a leapfrog fashion, with A and v p defined at integer time 
levels, and 0 and x p at half-time levels. 

The basic procedure for advancing the code for one time step is the following. At the 
beginning of each time step, we have the particles at (x p 1//2 , Up -1 ), the vector potential for the 
two previous steps (A n_1 , A n ), the static potential at 0 n-1 / 2 , and the fields (B n-1 / 2 , E n-1 / 2 ). 
Then, we perform the following steps: 

1. Advance particles to (xp" 1 " 1 ^ 2 , v p ), and accumulate the densities p n+1 / 2 and j n on the 
grid. 

2. Solve Eq. IA.4l for 0 n+1 / 2 . 

3. Solve Eq. IA.3l for A n+1 . 

To get the simulation started, we assume = 0, and End (A, 0) at t — 0 using the 

Darwin approximation (by solving Eq. [9] and I A. 41 note that the initial 5 t (j) is found from the 
solution of these equations). We use the velocity at t — 0 to reverse the particle position to 
t — — |, and then the above procedure can be used to advance the whole system. 

For exact charge conservation, we combine the charge-current interpolation scheme (Eqs. 
[T2lfT4l l38|) employed in this study with the cell-crossing scheme introduced by Ref. |4ol | . It is 
sufficient to focus the discussion on one particle substep from t n_1 / 2 to £ n+1 / 2 (the total charge 
is found by linear superposition of all particles). Note exact charge conservation is automatic 
if the particle substep is within a cell. When a particle crosses one or several cells in a single 
time step, we split the trajectory into several segments v — 1 , N u , separated by cell faces. 
For each segment, we compute position updates (Axp +1 ^ 2 ,A yp + ^ 2 )=(x p +1 —Xp,y p +1 — y p ) and 
partial contributions to the current density as: 


Up,x) 


V+ 1 / 2 

t+V 2 >j 


Up,y ) 


i'+V 2 
hi+ V 2 


h A^ A A 1/2s rw - »)s.k +,/s - *<+■/>). 
h A^ A < +,/a ^ +,/2(i » - *<) s i« +,/a - »+*/>), 


(A.5) 
(A.6) 


where x p +1 ^ 2 = ( x p +l + x p )/2 and y p +1 ^ 2 = ( y p +1 + y p )/ 2. Note that the interpolations are 
identical to those of the implicit scheme (Eqs. fl3 l fT2lh The total current density is obtained 
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by summing up all partial contributions for all particles (very similarly to the orbit-averaging 
procedure in the implicit case): 


jx(y) 


N „-1 


EE 


f+y 2 

J p,x(y)' 


p u=0 


For the ^-direction, since there is no cell-crossing, we simply have 
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Almost exactly the same procedure outlined in Sec. 13.11 can be used to prove that the 
following charge conservation equation is satisfied: 


(p)iT /2 - ( p)l 

At 


n—1/2 


(jx)i+y2,j ( jx)i-i/2,j (jy)i,j+y2 1 /2 _ ^ y\ 


Ax 


Ay 


With Eqs. IA.7I and IA.31IA.4l being satisfied, it is straightforward to prove (by taking the 
numerical divergence of Eq. IA.3D that V • A = 0 is also satisfied numerically at all times if 
it is satisfied initially. 
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